Symposium on Theoretical Aspects of Computer Science 2009 (Freiburg), pp. 159-170 
www.stacs-conf.org 



WEAK MSO WITH THE UNBOUNDING QUANTIFIER 






MIKOŁAJ BOJAŃCZYK 



University of Warsaw 

E-mail address: bojan@mimuw.edu.pl 

URL: www.mimuw.edu.pl/~bojan 



Abstract. A new class of languages of infinite words is introduced, called the max-regular 
languages, extending the class of ω-regular languages. The class has two equivalent 
descriptions: in terms of automata (a type of deterministic counter automaton), and in 
terms of logic (weak monadic second-order logic with a bounding quantifier). Effective 
translations between the logic and automata are given. 

1. Introduction 

This paper introduces a new class of languages of infinite words, which are called 
max-regular languages, and include all ω-regular languages. Max-regular languages can be 
described in terms of automata, and also in terms of a logic. A typical language in the class 
is the property "the distance between consecutive b's is unbounded", i.e. the language 

$L = \{a^{n_1} b a^{n_2} b a^{n_3} \cdots : \forall m\ \exists i\ \ n_i > m\}$. (1.1) 

A practical motivation can be given for considering properties that speak of bounded 
distance; e.g. a formula of the logic in this paper could specify that a system responds to 
requests with bounded delay. We will begin, however, with a more fundamental motivation, 
which is the question: what is a regular language of infinite words? 

There is little doubt as to what is a regular language of finite words. For instance, 
the requirement that the Myhill-Nerode equivalence relation has finitely many equivalence 
classes uniquely determines which languages of finite words should be regular. Other notions, 
such as finite semigroups or monadic second-order logic, also point to the same class. 

For infinite words, however, there is more doubt. Of course, the class of ω-regular 
languages has much to justify calling it regular, but some doubts remain as to its uniqueness. 
Consider, for instance, the language L mentioned above, or the set K of ultimately periodic 
words, i.e. words of the form $wv^\omega$, say over the alphabet {a, b}. Neither of these languages is 
ω-regular. However, under the commonly accepted definition of Myhill-Nerode equivalence 
for infinite words, given by Arnold in [2], both languages have exactly one equivalence class. 

Should these languages be called regular? If yes, what is the appropriate notion of regularity? 
In this paper we propose a notion of regular languages, which are called max-regular 
languages, that captures the language L, but not the language K. This new notion has many 
properties that one would wish from regular languages. The class is (effectively) closed under 
boolean operations, including negation. There is a finite index Myhill-Nerode relation, 
and equivalence classes are regular languages of finite words. There is an automaton model, 
there is a logical description, and translations between the two are effective. Emptiness 
is decidable. Membership is decidable (although since we deal with infinite words, the 
membership test is for certain finitely presented inputs, such as ultimately periodic words). 

So, what is this new class? One definition is in terms of logic. The max-regular 
languages are the ones that can be defined by formulas of weak monadic second-order logic 
extended with the unbounding quantifier. The term "weak" means that only quantification 
over finite sets is allowed. The unbounding quantifier $UX.\varphi(X)$ was introduced¹ in [3]; it 
says that the size of sets X satisfying $\varphi(X)$ is unbounded, i.e. 

$UX.\varphi(X) \;=\; \bigwedge_{n \in \mathbb{N}} \exists X\ (\varphi(X) \wedge n < |X| < \infty)$. (1.2) 

Monadic second-order logic with the unbounding quantifier for infinite trees was studied 
in [3], where an emptiness procedure was presented for formulas with restricted quantification 
patterns. This study was continued in [4], where the models were restricted from 
infinite trees to infinite words, but the quantification patterns considered were more relaxed. 
However, no decision procedure was given in [4] for full monadic second-order logic with the 
unbounding quantifier, and the expressive power of the logic seemed to be far too strong 
for the techniques used (no undecidability results are known, though). 

The basic idea in this paper is to restrict the set quantification to finite sets (i.e. weak 
quantification), while keeping the unbounding quantifier. It turns out that with this restriction, 
many of the problems encountered in [4] are avoided, and the resulting class is 
surprisingly robust. Note that for infinite words and without unbounding quantification, 
weak monadic second-order logic has the same expressive power as full monadic second- 
order logic; this is no longer true when the unbounding quantifier is allowed (we prove this 
using topological techniques). 

The main contribution of this paper is Theorem 3.2, which shows that weak monadic 
second-order logic with the unbounding quantifier has the same expressive power as deterministic 
max-automata. A max-automaton is a finite automaton equipped with counters, 
which store natural numbers. The important thing is that the counters are not read during 
the run (and therefore do not influence the control of the automaton), which avoids 
the usual undecidability problems of counter machines. The counters are only used in the 
acceptance condition, which requires some counter values to be bounded, and some to be 
unbounded. 

To the best of the author's knowledge, quantifiers similar to the unbounding quantifier 
have only been considered in [3, 4]. On the other hand, the idea to use automata with 
quantitative acceptance conditions has a long history, going back to the weighted automata of 
Schützenberger [11] (see [?] for a recent paper on weighted automata and related logics). 

The max-automata used in this paper are closely related to an automaton model that 
has been variously called a distance desert automaton in [10], a BS-automaton in [4], or an 
R-automaton in [1]. One important application of these automata, see [10], is that they can 
be used to solve the famous star-height problem², providing simpler techniques and better 
complexities than in the famous result of Hashiguchi [8]. (The reduction from the star-height 
problem is not to emptiness of the automata, but to something called limitedness.) 
Other problems that can be tackled using this type of automata include the star-height of 
tree languages [5] or the Mostowski index of ω-regular languages [6]. 

¹ The quantifier introduced in [3] was actually the negation of U, saying that the size is bounded. 

2. The automaton 

We begin our presentation with the automaton model. 

A max-automaton has a finite set of states Q and a finite set of counters Γ. It also has 
a finite set of transitions. Each transition reads an input letter, changes the state, and does 
a finite sequence of counter operations. The counter operations are: 

• c := c + 1. Increment counter c. 

• c := 0. Reset counter c. 

• output(c). Output the value of counter c. 

• c := max(c, d). Store in counter c the maximal value of counters c, d. 

A max-automaton is run on an infinite word $w \in \Sigma^\omega$. A run is an infinite sequence of 
transitions, with the usual requirement on consistency with the letters in the input word. 
Fix a run ρ. With each counter c ∈ Γ, we associate the sequence of counter values $\rho_c \in \mathbb{N}^* \cup \mathbb{N}^\omega$ 
that have been output by the instruction output(c). These outputs are used by the accepting 
condition, which is a boolean combination of clauses: "the sequence $\rho_c$ is bounded". 

Note that with this acceptance condition, it is only the set of values in $\rho_c$ that matters, 
and not their order or multiplicity. This is unlike the parity condition (where multiplicity is 
important), or the S-condition of [4], where the sequence $\rho_c$ is required to tend to infinity. 

The toolkit of counter operations could be modified without affecting the expressive 
power of max-automata. For instance, we could have an operation c := d, which is equivalent 
to c := 0 followed by c := max(c, d). On the other hand, the output instruction can be 
removed (in this case, $\rho_c$ would contain all values of the counter during the run). The 
output operation can be simulated by the others as follows: for every counter c, we add 
a new output counter c', which is never incremented. Instead of doing output(c), we do 
c' := c. This way, the counter c' gets only the values that were output on the original 
counter c. 
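
The counter operations can be made concrete with a small interpreter. The following is a minimal sketch, not part of the original construction: the tuple encoding of operations and the names `run_ops` and `assign` are assumptions chosen for illustration. It applies a finite sequence of operations to a counter valuation and shows the derived operation c := d as c := 0 followed by c := max(c, d).

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of operations (an assumption of this sketch):
# ("inc", c), ("reset", c), ("output", c), ("max", c, d).
Op = Tuple[str, ...]


def run_ops(ops: List[Op], counters: Dict[str, int],
            outputs: Dict[str, List[int]]) -> None:
    """Apply a finite sequence of counter operations in place."""
    for op in ops:
        kind, c = op[0], op[1]
        if kind == "inc":        # c := c + 1
            counters[c] += 1
        elif kind == "reset":    # c := 0
            counters[c] = 0
        elif kind == "output":   # output(c): record the current value
            outputs[c].append(counters[c])
        elif kind == "max":      # c := max(c, d)
            counters[c] = max(counters[c], counters[op[2]])
        else:
            raise ValueError(f"unknown operation: {kind}")


def assign(c: str, d: str) -> List[Op]:
    """Derived operation c := d, as c := 0 followed by c := max(c, d)."""
    return [("reset", c), ("max", c, d)]


counters = {"c": 0, "d": 0}
outputs = {"c": [], "d": []}
# Increment d three times, copy d into c, then output c.
run_ops([("inc", "d")] * 3 + assign("c", "d") + [("output", "c")],
        counters, outputs)
print(counters, outputs["c"])  # {'c': 3, 'd': 3} [3]
```

Note that the interpreter never branches on a counter value, which mirrors the requirement that counters do not influence the control of the automaton.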

Theorem 2.1. Emptiness is decidable for max-automata. 

Proof. The difficulty in the proof is dealing with the max operation. 

We will reduce the problem to a result from [4] . A direct and elementary proof can also 
be given. A U-automaton is a max-automaton that does not use the max operation, and 
where the acceptance condition is a positive boolean combination of clauses "counter c is 
unbounded" . 

Let A be a max-automaton that we want to test for emptiness. As is often the case, 
we will be searching not for an input word accepted by A, but for an accepting run of A 
(which is also an infinite word). Fix a single clause in the accepting condition, e.g. "counter 
c is unbounded". Below, we will show that the set of runs which satisfy this clause can be 
recognized by a nondeterministic U-automaton. In particular, the set of accepting runs of 
A is a boolean combination of languages accepted by U-automata. The result then follows 
from [4], where emptiness is shown decidable for boolean combinations of nondeterministic 
U-automata³. 

² This is the question of calculating the least number of nested stars in a regular expression (without 
negation) that defines a regular language $L \subseteq \Sigma^*$. 

Before we define the U-automaton that tests if counter c is unbounded, we introduce 
some auxiliary definitions. Let c, d be counters of the automaton A. Below we define what 
it means for a finite sequence of counter operations ρ to transfer c to d, possibly with an 
increment. (Formally, we are defining two ternary relations: T(ρ, c, d), for transfers, and 
TI(ρ, c, d), for transfers with an increment.) The idea is that after executing the operations 
ρ, the value of counter d is at least as big as the value of counter c before executing ρ. The 
definition of transfers is by induction on the length of ρ: 

• Every counter is transferred to itself by the empty sequence of operations, as well 
as the operations c := c + 1 and output(c). Furthermore, c := c + 1 also transfers c 
to itself with an increment. 

• The operation c := 0 transfers every counter to itself, except c. 

• The operation c := max(c, d) transfers every counter to itself, and also d to c. 

• If a sequence of operations $\rho_1$ transfers c to e, and a sequence of operations $\rho_2$ 
transfers e to d, then their concatenation $\rho_1\rho_2$ transfers c to d. If either of the 
transfers in $\rho_1$ or $\rho_2$ does an increment, then so does the transfer in $\rho_1\rho_2$. 

Note that the transfer relation is regular in the following sense: for any counters c and d, 
the set of words ρ that transfer counter c to d is a regular language of finite words, likewise 
for transfers with an increment. 
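
The inductive definition of transfers can be evaluated mechanically. The sketch below is an illustration under assumed conventions: operations are encoded as tuples such as `("inc", c)` or `("max", c, d)`, and a relation is a set of triples `(source, target, with_increment)`. The names `step_relation`, `compose` and `transfers` are hypothetical. Since only finitely many such relations exist for a fixed counter set, computing them letter by letter is also one way to see why the transfer relation is regular.

```python
def step_relation(op, counters):
    """Transfers performed by a single operation, as (src, dst, inc) triples."""
    kind, c = op[0], op[1]
    rel = {(x, x, False) for x in counters}   # default: each counter to itself
    if kind == "inc":
        rel.add((c, c, True))                 # c to itself with an increment
    elif kind == "reset":
        rel.discard((c, c, False))            # c := 0 does not transfer c
    elif kind == "max":
        rel.add((op[2], c, False))            # c := max(c, d) transfers d to c
    # "output" only has the identity transfers
    return rel


def compose(r1, r2):
    """Concatenation rule: c to e by r1 and e to d by r2 gives c to d."""
    return {(a, d, i1 or i2)
            for (a, b, i1) in r1
            for (b2, d, i2) in r2
            if b == b2}


def transfers(ops, counters):
    """All transfers of a finite sequence of operations."""
    rel = {(x, x, False) for x in counters}   # empty sequence
    for op in ops:
        rel = compose(rel, step_relation(op, counters))
    return rel


ops = [("inc", "d"), ("max", "c", "d"), ("reset", "d")]
print(sorted(transfers(ops, ["c", "d"])))
# [('c', 'c', False), ('d', 'c', False), ('d', 'c', True)]
```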

Let c be a counter. A finite sequence of positions $x_1 < \cdots < x_n$ in a run of A is 
called a c-loop if for any i < n, counter c is transferred to itself with an increment by the 
subrun between positions $x_i$ and $x_{i+1}$. For a counter d, a d-trace is a sequence of positions 
$x_1 < \cdots < x_n < y$ such that for some counter c, the positions $x_1 < \cdots < x_n$ are a c-loop, 
and counter c is transferred to d by the subrun between positions $x_n$ and y. 

Equipped with these definitions, we are ready to define a (nondeterministic) U-automaton 
that tests if counter c is unbounded in an input run. The U-automaton has only one counter, 
and it accepts if unbounded values are output to this counter. A run of this automaton 
(which inputs a run of the automaton A) proceeds as follows. It uses nondeterminism to 
guess a d-trace $x_1 < \cdots < x_n < y$, and it increments its counter at each of the positions $x_i$. 
Once it sees position y, it outputs the counter value (which is n), and resets the counter. 
It then finds another d-trace, and again outputs its length, and so on. It is not difficult to 
verify the correctness of this construction. ■ 

In this paper, we will be mainly interested in deterministic max-automata. 

3. The logic 

We consider an extension of weak monadic second-order logic, called weak unbounding 
logic. Recall that weak monadic second-order logic is an extension of first-order logic that 
allows quantification over finite sets (the restriction to finite sets is the reason for the name 
"weak"). In weak unbounding logic, we further add the unbounding quantifier UX, as 
defined in (1.2). 



³ The result in [4] is for S-automata, which are more powerful than U-automata. It is shown that a boolean 
combination of S-automata is equivalent to a BS-automaton, which has decidable emptiness. 

Example 3.1. Consider the set L from (1.1). This language is not ω-regular, but it is defined by 
the following formula of weak unbounding logic: 

$UX\ \forall x < y < z\ \ (x, z \in X \Rightarrow a(y) \wedge y \in X)$ 
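
To see what the formula of Example 3.1 measures, one can compute the largest witness set X inside a finite prefix. The sketch below assumes 0-indexed positions and the strict reading x < y < z of the formula; the function names are hypothetical. A set satisfying the inner formula must contain every position between its least and greatest element, and those in-between positions must carry the letter a, so it is enough to search over intervals.

```python
def satisfies(word: str, X: set) -> bool:
    """Inner formula: for all x < y < z with x, z in X, a(y) holds and y is in X."""
    if len(X) < 2:
        return True
    lo, hi = min(X), max(X)
    # y lies strictly between two elements of X iff lo < y < hi
    return all(word[y] == "a" and y in X for y in range(lo + 1, hi))


def largest_witness(prefix: str) -> int:
    """Largest |X| among sets of positions in the prefix that satisfy the formula."""
    best = 0
    for lo in range(len(prefix)):
        for hi in range(lo, len(prefix)):
            X = set(range(lo, hi + 1))       # witnesses are intervals
            if satisfies(prefix, X):
                best = max(best, len(X))
    return best


print(largest_witness("aabaaaab"))  # 6: four a's plus the two surrounding b's
```

Under this reading the witness size exceeds the length of the longest block of a's by at most two, so the witness sizes are unbounded along an infinite word exactly when the block lengths are.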

The main result of this paper is that the logic and automata coincide, i.e. 

Theorem 3.2. Weak unbounding logic defines exactly the same languages as deterministic 
max-automata. 

The more difficult direction in Theorem 3.2 is presented in Section 4. The easier direction, 
where an automaton is simulated by the logic, can be shown by combining standard 
techniques with the concepts from the proof of Theorem 2.1. The key idea is that a formula 
of weak unbounding logic can test if a set of positions $\{x_1 < \cdots < x_n < y\}$ forms a d-trace. 
It is important that the automata are deterministic, which allows a formula of weak logic 
to uniquely decode the run that corresponds to the input word. 

The formulas that are sufficient to simulate a deterministic max-automaton are of a 
special type, which gives a normal form for weak unbounding logic: 

Proposition 3.3. Each formula of weak unbounding logic is equivalent to a boolean combination 
of formulas $UX\,\varphi(X)$, where $\varphi(X)$ does not use the unbounding quantifier. 

Proof. By translating a formula into an automaton and then back into a formula. ■ 

4. Weak unbounding logic is captured by deterministic max-automata 

We now turn to the more difficult part of Theorem 3.2, namely showing that for every 
formula of weak unbounding logic there is an equivalent deterministic max-automaton. 

The proof is by induction on the size of the formula. To simplify the proof, we use 
the usual technique of removing first-order quantification, as in [13]. That is, first-order 
quantification is replaced by three new predicates, all of which can be recognized by 
deterministic max-automata: "set X has one element", "set X is included in set Y" and 
"all elements of set X are before all elements of set Y". Together with weak second-order 
quantification, these three new predicates can be used to simulate first-order quantification, 
so the logic is the same. However, since we have removed first-order quantification, in the 
translation to automata we only have to deal with quantification over finite sets (weak 
second-order quantification) and the new quantifier. 

For purposes of the induction, we generalize the statement to formulas with free variables. 
What is the word language corresponding to a formula $\varphi(X_1, \ldots, X_n)$? This language 
contains words annotated with valuations for the free set variables. We use the usual encoding, 
where the label of a word position $x \in \mathbb{N}$ is extended with a bit vector in $\{0,1\}^n$ 
that says which of the sets $X_1, \ldots, X_n$ contain position x. More formally, for sets of word 
positions $X_1, \ldots, X_n \subseteq \mathbb{N}$ and an infinite word $w \in \Sigma^\omega$, we define the word 

$w[X_1, \ldots, X_n] \in (\Sigma \times \{0,1\}^n)^\omega$ 

as follows. On position x, the new word has a tuple $(a, b_1, \ldots, b_n)$, with a the label of the 
x-th position of the original word w, and the value of bit $b_i$ being 1 if and only if position 
x belongs to the set $X_i$, for $i = 1, \ldots, n$. With this notation, we can define the set of words 
satisfying a formula $\varphi(X_1, \ldots, X_n)$ to be 

$\{w[X_1, \ldots, X_n] : w, X_1, \ldots, X_n \models \varphi\}$. 

Equipped with the above definition, we can use induction to show that the logic is 
captured by automata, as stated in the proposition below. This result is the main ingredient 
in the proof of Theorem 3.2. 

Proposition 4.1. For every formula φ of weak unbounding logic, the set of words satisfying φ 
is recognized by a deterministic max-automaton. 
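
The annotated word $w[X_1, \ldots, X_n]$ that Proposition 4.1 speaks about can be illustrated on a finite prefix. This is a minimal sketch: positions are 0-indexed here, which is an assumption of the example, and the function name `annotate` is hypothetical.

```python
def annotate(prefix, sets):
    """Pair each letter with the bit vector saying which sets contain its position."""
    return [(a, tuple(1 if x in X else 0 for X in sets))
            for x, a in enumerate(prefix)]


# Prefix "aba" with X_1 = {0, 2} and X_2 = {1}.
print(annotate("aba", [{0, 2}, {1}]))
# [('a', (1, 0)), ('b', (0, 1)), ('a', (1, 0))]
```

An automaton for a formula with n free set variables then reads letters from the product alphabet, here pairs of a letter and an n-bit tuple.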

The proof is by induction on the size of the formula φ. The induction base, which 
corresponds to the predicates "set X has one element", "set X is included in set Y" and 
"all elements of set X are before all elements of set Y", is easy, since all of these are ω-regular 
languages, and we have: 

Lemma 4.2. Deterministic max-automata capture all ω-regular languages. 

Proof. By simulating a deterministic automaton with the Muller or parity condition. We 
add a new counter $c_q$ for each state q of the automaton. Each time state q appears, counter 
$c_q$ is incremented and output. The counters are never reset. In a run of this automaton, a 
state appears infinitely often if and only if its counter is unbounded. Therefore, the Muller 
acceptance condition can be encoded in the unbounding condition of a max-automaton. ■ 
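
The counting idea in the proof of Lemma 4.2 can be observed on finite prefixes. The sketch below is only an illustration: the transition table `delta`, the state names, and the function `visit_counters` are assumptions of the example. It keeps one counter per state and increments it on every visit, so a state that keeps recurring has a counter that keeps growing with the prefix length.

```python
def visit_counters(delta, start, prefix):
    """One counter per state, incremented each time the state is entered."""
    counters = {}
    state = start
    for a in prefix:
        state = delta[(state, a)]
        counters[state] = counters.get(state, 0) + 1
    return counters


# A two-state automaton that remembers the last letter read.
delta = {(q, "a"): "A" for q in ("A", "B")}
delta.update({(q, "b"): "B" for q in ("A", "B")})

# On prefixes of b a a a ..., the counter of B stays at 1 while that of A grows.
print(visit_counters(delta, "A", "b" + "a" * 10))  # {'B': 1, 'A': 10}
```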

The induction step for boolean operations (including negation) is no more difficult, 
since the automata are deterministic and the accepting condition is closed under boolean 
operations. We are left with weak second-order quantification and the unbounding quantifier. 
We first deal with weak quantification, in Section 4.1, while the unbounding quantifier 
is treated in Section 4.2. 

4.1. Weak existential quantification 

This section is devoted to showing: 

Proposition 4.3. Languages recognized by deterministic max-automata are closed under 
weak quantification. In other words, if L is a language over $\Sigma \times \{0,1\}$ recognized by a 
deterministic max-automaton, then there is a deterministic max-automaton recognizing 

$\{w \in \Sigma^\omega : w[X] \in L \text{ for some finite set } X\}$. 

A convenient way to prove this result would be to use nondeterministic automata. 
Unfortunately, as we will later show, adding nondeterminism to max-automata gives power 
beyond that of weak unbounding logic, so we cannot use this strategy. We will have to do 
the existential quantification directly in the deterministic automata. 

The proof technique is actually very generic. It would work for any model of deterministic 
automata that captures all ω-regular languages and satisfies some relaxed assumptions, mainly 
that the acceptance condition is prefix-independent. 

Fix a deterministic max-automaton A that recognizes L, with state space Q. 

A partial run in an infinite word w is a run that begins in any position of the word 
(not necessarily the first position) and in any state (not necessarily the initial one). In 
other words, this is a word in $\bot^*\delta^\omega \cup \bot^\omega$, where δ is the set of transitions of A, that is 
consistent with the word w on those positions where it is defined (i.e. where it is not ⊥). 
Since the automaton is deterministic, a partial run is uniquely specified by giving the first 
configuration where it is defined; this is called the seed configuration. (There is also the 
undefined partial run $\bot^\omega$, which has no seed configuration.) Here, a configuration is a pair 
(q, x), where q is a state and x is a word position. Note that we do not include the counter 
values in the seed configuration, since the acceptance condition is not sensitive to finite 
perturbations. 

We say that two partial runs converge if they agree from some position on. Equivalently, 
they converge if they share some configuration, or both are undefined. We say a set of 
partial runs spans a word w if every partial run over w converges with some run from the 
set. Usually, we will be interested in finite sets of spanning runs. 

Lemma 4.4. For every word w, there is a set of at most |Q| spanning runs. 

Proof. We begin with some arbitrary configuration, and take the partial run $\rho_1$ that begins 
in that configuration. If $\{\rho_1\}$ is spanning, then we are done. Otherwise, we take some 
partial run $\rho_2$ that does not converge with $\rho_1$, and see if the set $\{\rho_1, \rho_2\}$ is spanning. If it 
is not, we add a third partial run $\rho_3$, and so on. This process terminates after at most |Q| 
steps, because if two partial runs do not converge, then they must use different states on 
each position where they are both defined. So |Q| partial runs that do not converge will 
use up all the states. ■ 

To prove Proposition 4.3, we use a result stronger than Lemma 4.4. We will show that 
not only the spanning set of runs exists, but it can also be computed by a (deterministic, 
letter-to-letter) transducer. By transducer we mean a finite deterministic automaton where 
each transition is equipped with an output letter, from an output alphabet Γ. Therefore, 
the transducer defines a function $f : \Sigma^\omega \to \Gamma^\omega$. The transducer does not have any accepting 
conditions (using bounds or even parity or Muller), it just scans the word and produces its 
output. It is easy to see that deterministic max-automata are closed under preimages of 
transducers, as shown in the following lemma. 

Lemma 4.5. If f is a transducer and A is a deterministic max-automaton, then there is 
a deterministic max-automaton recognizing the set of words w such that f(w) is accepted 
by A. 

We now describe how the spanning partial runs will be encoded in the output of the 
transducer. When speaking of spanning partial runs, we mean spanning partial runs of the 
automaton A in Proposition 4.3. A single partial run can be encoded as an infinite word 
over the alphabet $Q \times \{0,1\}$. The idea is that {0, 1} is used as a marker, with 0 meaning 
"ignore the prefix until this position", and 1 meaning "do not ignore". Formally, an infinite 
word 

$(q_1, a_1)(q_2, a_2) \cdots \in (Q \times \{0,1\})^\omega$ 

is interpreted as the partial run which on position i has ⊥ if $a_j = 0$ for some j > i, otherwise 
it has $q_i$. Note that if the word above has infinitely many positions j with $a_j = 0$, then the 
partial run is nowhere defined, i.e. it is $\bot^\omega$. If we want to encode n partial runs, we use n 
parallel word sequences, encoded as a single sequence over the product alphabet 

$(Q \times \{0,1\})^n$. 

With the encoding of spanning runs defined, we are now ready to present the stronger 
version of Lemma 4.4. 

Lemma 4.6. Let n = |Q|. There is a transducer 

$f : \Sigma^\omega \to ((Q \times \{0,1\})^n)^\omega$ 

such that for any word w, the output f(w) encodes n spanning partial runs. 

Proof. The idea is to implement the proof of Lemma 4.4 in a transducer. The states of the 
transducer will be permutations of the state space, i.e. tuples from $Q^n$ where each state 
appears exactly once. The initial state is any arbitrarily chosen permutation. When reading 
an input letter a in state $\pi = (q_1, \ldots, q_n)$, the transducer does the following operations. 
First, it transforms each state in π according to the letter a, giving a tuple $x = (q_1 a, \ldots, q_n a)$. 
This tuple is not necessarily a permutation, i.e. there may be some coordinates 
$i \in \{1, \ldots, n\}$ such that the state $q_i a$ appears already in $\{q_1 a, \ldots, q_{i-1} a\}$. Let $I = \{i_1, \ldots, i_k\}$ 
be these coordinates, and let $\{p_1, \ldots, p_m\}$ be the states that do not appear in the new tuple 
x. These two sets have the same size, i.e. k = m. We can now correct x to be a permutation 
σ, by replacing its coordinate $i_1$ with the state $p_1$, the coordinate $i_2$ with state $p_2$, and so 
on. Note that on the coordinates from I, the new permutation σ has a value unrelated 
to the one from π (i.e. σ begins a new run), while on coordinates from outside I, the new 
permutation σ simply continues the runs from π. This is signified in the output of the 
transducer, which decorates each coordinate i of the permutation σ with a bit, which is 
0 when $i \in I$ and 1 otherwise. ■ 
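
One step of the transducer from the proof of Lemma 4.6 can be sketched as follows. The dictionary encoding of the transition function, the order in which missing states are assigned, and the name `transducer_step` are assumptions made for illustration; the proof allows any assignment of the missing states to the clashing coordinates.

```python
def transducer_step(pi, a, delta, states):
    """Read letter a in permutation pi; return (new permutation, output letter)."""
    x = [delta[(q, a)] for q in pi]          # advance every coordinate
    seen, clash = set(), []
    for i, q in enumerate(x):
        if q in seen:
            clash.append(i)                  # q already occurs at an earlier coordinate
        else:
            seen.add(q)
    missing = [q for q in states if q not in seen]
    sigma = list(x)
    for i, p in zip(clash, missing):         # both lists have the same length
        sigma[i] = p                         # coordinate i starts a fresh run
    clash_set = set(clash)
    bits = [0 if i in clash_set else 1 for i in range(len(sigma))]
    return tuple(sigma), tuple(zip(sigma, bits))


states = [0, 1, 2]
delta = {(0, "a"): 1, (1, "a"): 1, (2, "a"): 2}
print(transducer_step((0, 1, 2), "a", delta, states))
# ((1, 0, 2), ((1, 1), (0, 0), (2, 1)))
```

In the example, coordinates 0 and 1 both move to state 1, so coordinate 1 is restarted in the missing state 0 and marked with bit 0.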

We are now ready to prove Proposition 4.3. By properties of spanning sets of runs, 
a word $w \in \Sigma^\omega$ belongs to the language of the proposition if and only if there is some 
$i = 1, \ldots, n$ such that the following two properties hold: 

(A) The i-th run encoded by f(w) is defined (i.e. the encoding does not contain infinitely 
many cancelling 0s) and satisfies the accepting condition in the automaton A. 

(B) There is some finite set $X \subseteq \mathbb{N}$ such that the run of A over w[X] converges with the 
i-th run encoded by f(w). 

Since deterministic max-automata are closed under union, it suffices to show that for each 
fixed i, both properties (A) and (B) are recognized by deterministic max-automata. For 
property (A), we use Lemma 4.5 on preimages. Property (B), on the other hand, is an 
ω-regular property, which can be recognized by a deterministic max-automaton thanks to 
Lemma 4.2. 

4.2. Unbounding quantification 

We now turn to the more difficult part of Proposition 4.1, namely that deterministic 
max-automata are closed under unbounding quantification. 

Proposition 4.7. Languages recognized by deterministic max-automata are closed under 
unbounding quantification. In other words, if L is a language over $\Sigma \times \{0,1\}$ recognized by 
a deterministic max-automaton, then so is 

$UL = \{w \in \Sigma^\omega : w[X] \in L \text{ for arbitrarily large finite sets } X\}$. 

Fix a deterministic max-automaton A recognizing the language L in the proposition. 
Given a finite prefix $w \in \Sigma^*$ and a state q of A, let max(q, w) be the maximal size of a set X 
of positions in w such that the automaton A reaches state q after reading w[X]. We claim 
that the values max(q, w) can be computed in the counters of a deterministic max-automaton 
(not surprisingly, using the max operation). 
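
The values max(q, w) can be maintained letter by letter, using only increments and maxima. The sketch below is an illustration, not the construction of the paper: it assumes the transition function of A is given as a dictionary over letters of the form `(a, bit)`, uses `None` for states not reachable under any choice of X, and the name `max_sizes` is hypothetical.

```python
def max_sizes(delta, start, states, prefix):
    """For each state q, the largest |X| such that A reaches q on prefix[X]."""
    c = {q: None for q in states}            # None: q not reachable so far
    c[start] = 0
    for a in prefix:
        new = {q: None for q in states}
        for q in states:
            if c[q] is None:
                continue
            for bit in (0, 1):               # current position outside or inside X
                q2 = delta[(q, (a, bit))]
                cand = c[q] + bit            # increment when the position joins X
                if new[q2] is None or cand > new[q2]:
                    new[q2] = cand           # keep the maximum
        c = new
    return c


# Example automaton: tracks the parity of |X| over the one-letter alphabet {a}.
states = ["even", "odd"]
delta = {("even", ("a", 0)): "even", ("even", ("a", 1)): "odd",
         ("odd", ("a", 0)): "odd", ("odd", ("a", 1)): "even"}
print(max_sizes(delta, "even", states, "aaa"))  # {'even': 2, 'odd': 3}
```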

Lemma 4.8. There is a deterministic max-automaton with counters $\{c_q\}_{q \in Q}$ such that the 
value of $c_q$ after reading a prefix $a_1 \cdots a_n$ of the input is exactly $\max(q, a_1 \cdots a_n)$. 

We will use the values from the above lemma to capture the unbounding quantifier. 
However, some more effort is needed: it is not the case that an input word $w = a_1 a_2 \cdots$ 
belongs to UL if and only if the values $\max(q, a_1 \cdots a_n)$ are unbounded. In general, only 
the left to right implication holds. The right to left implication may fail since a value 
$\max(q, a_1 \cdots a_n)$ is relevant only if the run of A over w that begins in configuration (q, n) 
can be extended to an accepting one over the rest of the word. The correct characterization 
is given below: 

Lemma 4.9. A word $a_1 a_2 \cdots \in \Sigma^\omega$ belongs to UL if and only if for some state q, the 
following values are unbounded: 

$\{\max(q, a_1 \cdots a_n) : a_{n+1} a_{n+2} \cdots [\emptyset] \in (\Sigma \times \{0,1\})^\omega \text{ is accepted by } A \text{ when starting in } q\}$ 

As suggested by the above lemma, to recognize the language UL it would be convenient 
to have an extension of max-automata, where the automaton would have the ability 
to output $\max(q, a_1 \cdots a_n)$ only in case a certain property was satisfied by the suffix 
$a_{n+1} a_{n+2} \cdots$. Below, we introduce such an extension of max-automata, which we call a 
guarded max-automaton. We then show that this extension can be simulated by a standard 
max-automaton, thus completing the proof of Proposition 4.7. 

A guarded max-automaton is like a max-automaton, except that it is also allowed to 
use the following counter operation: 

if L then output(c). Output the value of counter c, but only if the suffix of the 
input beginning at the next position belongs to $L \subseteq \Sigma^\omega$. 

In the above operation, the language L — called the guard of the transition — must be a 
language recognized by a max-automaton (without guards, although allowing guards would 
give the same result). This new operation is all we need to recognize the language UL: 

Lemma 4.10. If a language L is recognized by a deterministic max-automaton, then UL 
is recognized by a deterministic guarded max-automaton. 

We will show that guarded outputs are redundant, and can be simulated by non-guarded 
outputs. This completes the proof of Proposition 4.7. The difficulty in the proof below is that 
we are dealing with deterministic automata, while a guard looks to the future. 

Proposition 4.11. For every deterministic guarded max-automaton there is an equivalent 
deterministic max-automaton. 

Proof. Let A be a deterministic guarded max-automaton. To simplify notation, we assume 
that only one guarded operation, 

o = if L then output(c), 

is used. The general case is done the same way. Let B be a deterministic max-automaton 
recognizing the guard language L. 

In the construction, we will use a concept of thread. A thread consists of a state of 
the automaton B, as well as a number, which corresponds to the value of counter c output 
by the guarded operation o. Note that a thread does not contain information about values 
of the counters of automaton B. The idea is that threads will be alive for only finitely 
many steps, so the counters of B are not relevant. We will denote threads by τ. If $a \in \Sigma$ 
is an input letter, then we write τa for the thread obtained from τ by updating the state 
according to a (and leaving the number unchanged). 

The (non-guarded) max-automaton C that simulates A works as follows. At each point, 
the simulating automaton contains a finite set $\{\tau_1, \ldots, \tau_i\}$ of active threads. There will be 
at most one thread per state of B, so the set of threads can be stored using finitely many 
counters and the finite memory of the automaton. This set of active threads is initially 
empty. Whenever A does the guarded output operation o, a new active thread is created, 
with the initial state of B, and the number set to the value of counter c. Furthermore, after 
reading an input letter $a \in \Sigma$, the set of active threads is updated to $\{\tau_1 a, \ldots, \tau_i a\}$. If two 
active threads have the same state, then they are merged, and only the greater number is 
kept (using the max operation). 
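
The thread bookkeeping described above can be sketched in a few lines. The representation of the active threads as a dictionary from states of B to numbers, and the names `spawn_thread` and `step_threads`, are assumptions of this illustration. In the automaton C the numbers live in counters and merging is done with the max operation, whereas here they are plain integers.

```python
def spawn_thread(threads, initial_b, value):
    """Guarded output performed: start a thread in the initial state of B."""
    out = dict(threads)
    out[initial_b] = max(out.get(initial_b, value), value)
    return out


def step_threads(threads, a, delta_b):
    """Advance every thread on letter a; merge threads that meet in one state."""
    merged = {}
    for state, number in threads.items():
        nxt = delta_b[(state, a)]
        merged[nxt] = max(merged.get(nxt, number), number)  # keep the greater number
    return merged


# B has states 0 (initial) and 1; on letter a both states move to 1.
delta_b = {(0, "a"): 1, (1, "a"): 1}
threads = spawn_thread({}, 0, 5)             # {0: 5}
threads = step_threads(threads, "a", delta_b)  # {1: 5}
threads = spawn_thread(threads, 0, 7)        # {1: 5, 0: 7}
threads = step_threads(threads, "a", delta_b)  # the two threads merge in state 1
print(threads)  # {1: 7}
```

Because threads are keyed by state, there is never more than one thread per state of B, matching the bound that lets C store them in finitely many counters.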

Similarly to the proof of Proposition 4.3, the automaton C will also read the output of a 
transducer f that computes spanning partial runs of the automaton B used for the guards. 
Recall that the transducer f outputs n spanning partial runs of the automaton B, where n 
is the number of states in B. 

The automaton C accepts a word w if and only if there is some $i = 1, \ldots, n$ such that: 

(A) The i-th run encoded by f(w) is defined (i.e. the encoding does not contain infinitely 
many cancelling 0s) and satisfies the accepting condition in the automaton B. 

(B) For every m, some thread storing a number greater than m converges with the i-th run 
encoded by f(w). 

Since deterministic max-automata are closed under finite union, we only need to show 
the construction for some fixed i. As in the previous section, property (A) is recognized by 
a deterministic max-automaton. For property (B), it suffices to output the number stored 
in a thread τ whenever its state is the same as in $\rho_i$, the i-th run encoded by f(w). The 
automaton then accepts if the numbers thus produced are unbounded. 

■ 

5. Problems with nondeterminism 

In this section we show that nondeterministic max-automata are more expressive than 
deterministic ones. 

Theorem 5.1. Nondeterministic max-automata recognize strictly more languages than de- 
terministic ones. 

Contrast this result with the situation for Muller or parity automata, which are equally 
expressive in the deterministic and nondeterministic variants. Since full monadic second- 
order logic can capture nondeterministic automata by existentially quantifying over infinite sets, 
the above theorem immediately implies: 

Corollary 5.2. Full monadic second-order logic with the unbounding quantifier is stronger 
than weak monadic second-order logic with the unbounding quantifier. 

The separating language in Theorem 5.1 is 

$$L = \{a^{n_1} b a^{n_2} b a^{n_3} b \cdots : \text{some number appears infinitely often in } n_1, n_2, \ldots\}. \qquad (5.1)$$

This language is captured by a nondeterministic max-automaton. The automaton uses 
nondeterminism to output a subsequence of $n_1, n_2, \ldots$ and accepts if this subsequence is 
bounded. Clearly, if it is bounded, then it contains an infinite constant subsequence. 

It remains to show that the language L cannot be recognized by a deterministic max- 
automaton. For this, we will use topological complexity. In Lemmas 5.3 and 5.4 we 
will show that every language recognized by a deterministic max-automaton is a boolean 
combination of sets on level $\Sigma_2$ in the Borel hierarchy, while the language L is not. 

Below we briefly describe the Borel hierarchy, a way of measuring the complexity of 
a subset of a topological space. The topology that we use on words is that of the Cantor 
space. A set of infinite words (over a given alphabet $\Sigma$) is called open 
if it is a union 

$$\bigcup_{i \in I} w_i \Sigma^\omega, \qquad w_i \in \Sigma^*,$$

with the index set $I$ being possibly infinite. In other words, membership of a word w in 
an open set is assured already by a finite prefix of w. For the Borel hierarchy, as far as 
max-automata are concerned, we will only be interested in the first two levels $\Sigma_1, \Pi_1, \Sigma_2, \Pi_2$. 
The open subsets are called $\Sigma_1$, the complements of these (the closed subsets) are called $\Pi_1$. 
Countable intersections of open subsets are called $\Pi_2$, the complements of these (countable 
unions of closed subsets) are called $\Sigma_2$. 

Lemma 5.3. Any language accepted by a deterministic max-automaton is a boolean com- 
bination of $\Sigma_2$ sets. 

Proof. Fix a max-automaton A, and a counter c of this automaton. We will examine the 
topological complexity of the set of runs of this automaton (here, a run is an infinite sequence 
of transitions). For any fixed n, the following set of runs is clearly open: 

A value of at least n is output at least once on counter c. 
In particular, its complement 

All values of counter c are at most n. 
is a closed set of runs. By taking a countable union of the above over $n \in \mathbb{N}$, we deduce 
that the property 

The values of counter c are bounded, 
is a $\Sigma_2$ property. Since the acceptance condition is a boolean combination of such 
boundedness properties, the set of accepting runs of any max-automaton is a boolean 
combination of $\Sigma_2$ sets. Since the automata are deterministic, the function that maps an 
input word to its run is continuous, i.e. preimages of open sets are also open. Since taking 
preimages under continuous functions preserves the levels of the hierarchy, we conclude that any 
language accepted by a deterministic max-automaton is a boolean combination of $\Sigma_2$ sets. ■ 
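
The countable union used in the proof can be written out explicitly. The notation below is an assumption introduced only for this illustration: $\rho|_k$ denotes the first $k$ transitions of a run $\rho$, and $\mathrm{out}_c(\rho|_k)$ denotes the largest value output on counter c within those transitions (with value 0 if nothing has been output yet).

```latex
\[
  \{\rho : \text{the values of counter } c \text{ are bounded}\}
  \;=\;
  \bigcup_{n \in \mathbb{N}} \; \bigcap_{k \in \mathbb{N}}
  \{\rho : \mathrm{out}_c(\rho|_k) \le n\}.
\]
% Each set \{\rho : \mathrm{out}_c(\rho|_k) \le n\} depends only on the first k
% transitions of \rho, so it is both open and closed. For fixed n, the
% intersection over k is therefore closed (a \Pi_1 set), and the countable
% union over n is a \Sigma_2 set.
```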

Lemma 5.4. The language L is not a boolean combination of $\Sigma_2$ sets. 

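Proof. Consider the mapping from $\mathbb{N}^\omega$ to $\{a, b\}^\omega$ defined by 

$$n_1, n_2, \ldots \;\mapsto\; a^{n_1} b a^{n_2} b \cdots$$

This is a continuous mapping. The language L is the image, under this mapping, of the set 
X of sequences in $\mathbb{N}^\omega$ that have a bounded subsequence. Since the mapping is injective, 
X is also the preimage of L, so if L were a boolean combination of $\Sigma_2$ sets, then so 
would be X. The set X is, however, known not to be a boolean combination of $\Sigma_2$ sets, 
see Exercise 23.2 in [9]. ■ 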
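
Continuity of the mapping in this proof means that every finite prefix of the output word is determined by a finite prefix of the input sequence. The following sketch illustrates this on finite prefixes only; the function name `encode_prefix` is hypothetical and introduced just for this example.

```python
from typing import Sequence


def encode_prefix(numbers: Sequence[int]) -> str:
    """Encode a finite prefix n_1, ..., n_k as the word a^{n_1} b ... a^{n_k} b."""
    return "".join("a" * n + "b" for n in numbers)


# Extending the input prefix only extends the output word, it never changes
# the letters that were already produced.
short = encode_prefix([2, 0, 1])
longer = encode_prefix([2, 0, 1, 3])
assert longer.startswith(short)
```

The encoding is also injective, since the positions of the letter b determine the numbers uniquely.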

6. Conclusion 

This paper is intended as a proof of concept. The concept is that $\omega$-regular languages 
can be extended in various ways, while still preserving good closure properties and decid- 
ability. The class presented in this paper, max-regular languages, is closed under boolean 
operations, inverse morphisms, and quotients. It is not closed under morphic images (which 
corresponds to nondeterminism on the automaton side). 

Some questions on max-automata are left unresolved. Is the max operation necessary 
in the automaton? In our construction, we use the max twice: when defining the values 
$\max(q, a_1 \cdots a_n)$, and in Proposition 4.11. While in the first case, the max operation can 
be avoided by a subtle use of factorization forests [12], it is not clear how to show 
Proposition 4.11 without using the max operation. Another question is the exact complexity of 
emptiness. It would be nice to get matching upper and lower bounds, even more so if the 
lower bound would use acceptance conditions in DNF. 

There are several other possibilities of future work. One is to investigate weak bounding 
logic for infinite trees (note that we will not capture all regular languages of infinite trees 
in this case). Another possibility would be to investigate full monadic second-order logic, 
or possibly other quantifiers that can be added to weak monadic second-order logics. The 
techniques used in this paper are fairly generic, so it seems plausible that such quantifiers 
can be found. 